
    Informed algorithms for sound source separation in enclosed reverberant environments

    While humans can separate a sound of interest amidst a cacophony of contending sounds in an echoic environment, machine-based methods lag behind in solving this task. This thesis therefore aims to improve the performance of audio separation algorithms when they are informed, i.e. have access to source location information. These locations are assumed to be known a priori in this work, for example from video processing. Initially, a multi-microphone array-based method combined with binary time-frequency masking is proposed. A robust least-squares frequency-invariant data-independent beamformer, designed with the location information, is used to estimate the sources. To further enhance the estimated sources, binary time-frequency masking is applied as post-processing, with cepstral-domain smoothing to mitigate musical noise. To tackle the under-determined case and further improve separation performance at higher reverberation times, a two-microphone method inspired by human auditory processing, which generates soft time-frequency masks, is described. In this approach the interaural level difference, interaural phase difference and mixing vectors are modeled probabilistically in the time-frequency domain, and the model parameters are learned through the expectation-maximization (EM) algorithm. A direction vector, estimated for each source from the location information, is used as the mean parameter of the mixing-vector model. Soft time-frequency masks are used to reconstruct the sources. A spatial covariance model is then integrated into the probabilistic model framework; it encodes the spatial characteristics of the enclosure and further improves separation performance in challenging scenarios, i.e. when sources are in close proximity and when the level of reverberation is high.
    Finally, new dereverberation-based pre-processing is proposed, based on a cascade of three dereverberation stages, each of which enhances the two-microphone reverberant mixture. The dereverberation stages are based on amplitude spectral subtraction, in which the late reverberation is estimated and suppressed. The combination of such dereverberation-based pre-processing and soft-mask separation yields the best separation performance. All methods are evaluated with real and synthetic mixtures formed, for example, from speech signals from the TIMIT database and measured room impulse responses.
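The interaural cues that the abstract's probabilistic model works on can be computed directly from the two-channel STFT. The sketch below is illustrative only, not the thesis implementation: it derives per-bin interaural level and phase differences from a pair of complex spectrograms, i.e. the raw time-frequency features such a model would fit with EM. The array shapes and the small regularizer `eps` are assumptions.

```python
import numpy as np

def interaural_cues(X_left, X_right, eps=1e-12):
    """Per time-frequency-bin interaural cues from two complex STFTs.

    X_left, X_right: complex arrays of shape (n_freq, n_frames).
    Returns the interaural level difference (ILD, dB) and the
    interaural phase difference (IPD, radians in (-pi, pi]).
    """
    ratio = (X_left + eps) / (X_right + eps)   # eps avoids division by zero
    ild = 20.0 * np.log10(np.abs(ratio) + eps)  # level cue in dB
    ipd = np.angle(ratio)                       # phase cue
    return ild, ipd

# Toy example: random complex arrays standing in for real mixture STFTs.
rng = np.random.default_rng(0)
X_l = rng.standard_normal((257, 100)) + 1j * rng.standard_normal((257, 100))
X_r = rng.standard_normal((257, 100)) + 1j * rng.standard_normal((257, 100))
ild, ipd = interaural_cues(X_l, X_r)
```

A soft mask would then be obtained from the posterior probability of each source given these cues in every bin.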

    Single channel speech enhancement by colored spectrograms

    Speech enhancement concerns the processes required to remove unwanted background sounds from target speech to improve its quality and intelligibility. In this paper, a novel approach for single-channel speech enhancement is presented using colored spectrograms. We propose a deep neural network (DNN) architecture adapted from the pix2pix generative adversarial network (GAN) and train it on colored spectrograms of speech to denoise them. After denoising, the colors of the spectrograms are translated to magnitudes of the short-time Fourier transform (STFT) using a shallow regression neural network. These estimated STFT magnitudes are then combined with the noisy phases to obtain the enhanced speech. The results show an improvement of almost 0.84 points in perceptual evaluation of speech quality (PESQ) and 1% in short-term objective intelligibility (STOI) over the unprocessed noisy data. The gain in quality and intelligibility over the unprocessed signal is almost equal to that achieved by the baseline methods used for comparison, but at a much reduced computational cost. The proposed solution offers a comparable PESQ score at almost 10 times lower computational cost than a similar baseline model, trained on grayscale spectrograms, that produced the highest PESQ score, while it gives up only 1% in STOI at 28 times lower computational cost compared with another baseline based on a convolutional neural network GAN (CNN-GAN) that produces the most intelligible speech. Comment: 18 pages, 6 figures, 5 tables
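The final reconstruction step, combining estimated STFT magnitudes with the noisy phase and inverting, can be sketched as below. This is a minimal illustration, not the paper's code: the denoising network and the regression network are replaced by a placeholder that simply reuses the noisy magnitude, so the round trip reconstructs the input; the sampling rate and frame length are assumptions.

```python
import numpy as np
from scipy.signal import stft, istft

fs = 16000
noisy = np.random.default_rng(1).standard_normal(fs)  # stand-in for a noisy recording

# Analysis: keep the noisy phase, which is reused unmodified at synthesis.
f, t, X = stft(noisy, fs=fs, nperseg=512)
noisy_phase = np.angle(X)

# Placeholder for the network-estimated magnitude (here: the noisy magnitude).
est_mag = np.abs(X)

# Synthesis: estimated magnitude + noisy phase, then inverse STFT.
X_hat = est_mag * np.exp(1j * noisy_phase)
_, enhanced = istft(X_hat, fs=fs, nperseg=512)
```

With a real magnitude estimate from the denoiser, `enhanced` would be the output speech; with the placeholder it simply recovers the input, which makes the plumbing easy to verify.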

    Printed Sleeve Monopole Antenna


    Nitrous Oxide in Oxygen and Air in Oxygen for Perioperative Analgesia : A Comparative study

    Background: To determine whether an additional dose of nalbuphine is required when using medical air instead of nitrous oxide in oxygen to maintain anaesthesia, so that inadequate intra-operative analgesia can be avoided. Methods: This quasi-experimental study was carried out in the Department of Anaesthesia, Holy Family Hospital, Rawalpindi, from October 2007 to March 2008. One hundred patients were selected by non-probability convenience sampling. Patients between 20 and 40 years of age, belonging to ASA Class I and II, were included. They were divided into two groups (A and B) scheduled for different elective surgical procedures under general anaesthesia. Group A comprised fifty patients who received medical air in oxygen; Group B comprised fifty patients who received nitrous oxide in oxygen. The conduct of anaesthesia was kept the same in both groups. Patients' heart rate, mean arterial pressure, pulse oximetry and ECG were monitored, and the requirement for an additional dose of nalbuphine in both groups was noted. Intra-operative tachycardia and hypertension indicated an additional dose of nalbuphine. The average heart rate and blood pressure of each case were determined, and the data were compared and analyzed with SPSS-10. Results: Forty patients in group A did not require additional intra-operative nalbuphine, while the remaining ten patients required it. Forty-eight patients in group B did not require additional intra-operative nalbuphine, and only two patients required it. Conclusion: The use of nitrous oxide significantly reduces the intra-operative narcotic analgesia requirement.

    MaPLe: Multi-modal Prompt Learning

    Pre-trained vision-language (V-L) models such as CLIP have shown excellent generalization ability to downstream tasks. However, they are sensitive to the choice of input text prompts and require careful selection of prompt templates to perform well. Inspired by the Natural Language Processing (NLP) literature, recent CLIP adaptation approaches learn prompts as the textual inputs to fine-tune CLIP for downstream tasks. We note that using prompting to adapt representations in a single branch of CLIP (language or vision) is sub-optimal, since it does not allow the flexibility to dynamically adjust both representation spaces on a downstream task. In this work, we propose Multi-modal Prompt Learning (MaPLe) for both the vision and language branches to improve alignment between the vision and language representations. Our design promotes strong coupling between the vision-language prompts to ensure mutual synergy and discourages learning independent uni-modal solutions. Further, we learn separate prompts across different early stages to progressively model stage-wise feature relationships and allow rich context learning. We evaluate the effectiveness of our approach on three representative tasks: generalization to novel classes, new target datasets and unseen domain shifts. Compared with the state-of-the-art method Co-CoOp, MaPLe exhibits favorable performance and achieves an absolute gain of 3.45% on novel classes and 2.72% on the overall harmonic mean, averaged over 11 diverse image recognition datasets. Our code and pre-trained models are available at https://github.com/muzairkhattak/multimodal-prompt-learning. Comment: Accepted at CVPR 2023
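The coupling between the two branches described above can be sketched in a few lines. This is a shape-level illustration only, not the MaPLe implementation: the dimensions (4 context vectors, 512-dimensional embeddings, 77 text tokens) are assumptions, and random arrays stand in for learned parameters. The key idea shown is that vision-branch prompts are derived from the language prompts through a coupling projection rather than learned independently.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 512                                              # assumed embedding width

# Learnable language prompts and a linear coupling that maps them into the
# vision branch, so the two sets of prompts cannot drift apart.
text_prompts = rng.standard_normal((4, d))
coupling = rng.standard_normal((d, d)) / np.sqrt(d)  # stand-in for a learned projection
vision_prompts = text_prompts @ coupling

# Prepend the language prompts to frozen text token embeddings, as prompt
# tuning does; the vision prompts would be prepended to patch embeddings.
tokens = rng.standard_normal((77, d))                # stand-in for token embeddings
prompted = np.concatenate([text_prompts, tokens], axis=0)
```

In the actual method this prompting is repeated across several early transformer stages; the sketch shows a single stage.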

    Fine-tuned CLIP Models are Efficient Video Learners

    Large-scale multi-modal training with image-text pairs imparts strong generalization to the CLIP model. Since training on a similar scale for videos is infeasible, recent approaches focus on the effective transfer of image-based CLIP to the video domain. In this pursuit, new parametric modules are added to learn temporal information and inter-frame relationships, which require meticulous design effort. Furthermore, when the resulting models are trained on videos, they tend to overfit to the given task distribution and lack generalization. This begs the following question: how can image-level CLIP representations be transferred effectively to videos? In this work, we show that a simple Video Fine-tuned CLIP (ViFi-CLIP) baseline is generally sufficient to bridge the domain gap from images to videos. Our qualitative analysis illustrates that frame-level processing by the CLIP image encoder, followed by feature pooling and similarity matching with the corresponding text embeddings, helps to implicitly model temporal cues within ViFi-CLIP. Such fine-tuning helps the model focus on scene dynamics, moving objects and inter-object relationships. For low-data regimes where full fine-tuning is not viable, we propose a `bridge and prompt' approach that first uses fine-tuning to bridge the domain gap and then learns prompts on the language and vision sides to adapt CLIP representations. We extensively evaluate this simple yet strong baseline on zero-shot, base-to-novel generalization, few-shot and fully supervised settings across five video benchmarks. Our code is available at https://github.com/muzairkhattak/ViFi-CLIP. Comment: Accepted at CVPR 2023
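The frame-level processing, pooling and similarity matching described above can be sketched as follows. Random arrays stand in for actual CLIP features, and the 512-dimensional embedding size and temperature value are assumptions; this illustrates the scoring scheme, not the released code.

```python
import numpy as np

def video_logits(frame_feats, text_feats, temperature=0.01):
    """Pool per-frame image embeddings into one video embedding, then
    score it against per-class text embeddings by cosine similarity.

    frame_feats: (n_frames, d) per-frame image-encoder features.
    text_feats:  (n_classes, d) text-encoder features, one per class prompt.
    """
    video = frame_feats.mean(axis=0)                 # temporal average pooling
    video = video / np.linalg.norm(video)            # unit-normalize
    text = text_feats / np.linalg.norm(text_feats, axis=1, keepdims=True)
    return text @ video / temperature                # scaled cosine similarities

# Toy example: a 16-frame clip scored against 5 class prompts.
rng = np.random.default_rng(0)
logits = video_logits(rng.standard_normal((16, 512)),
                      rng.standard_normal((5, 512)))
```

The predicted class is then simply `logits.argmax()`; no extra temporal module is involved, which is the point the abstract makes.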

    Harmonic Scalpel Hemorrhoidectomy Vs Milligan-Morgan Hemorrhoidectomy

    Background: To compare Harmonic Scalpel Hemorrhoidectomy (HSH) with classical Milligan-Morgan Hemorrhoidectomy (MMH) in terms of operation time and post-operative pain, to establish the effectiveness of this novel procedure. Methods: A total of 62 patients planned for excision hemorrhoidectomy were randomly assigned to HSH and MMH groups. Mean operation time was recorded during surgery, and pain at the time of first defecation was recorded on a visual analog scale (VAS). Results: Mean VAS after surgery at the time of first defecation was 4.32 (SD 0.909) in the HSH group and 6.97 (SD 1.426) in the MMH group (p < 0.001). Mean operation time in the HSH group was 18.13 (SD 3.956) minutes and that of the MMH group was 22.90 (SD 4.901) minutes (p < 0.001). Conclusion: Harmonic Scalpel Hemorrhoidectomy is better than Milligan-Morgan hemorrhoidectomy.

    A Dynamically Consistent Nonstandard Difference Scheme for a Discrete-Time Immunogenic Tumors Model

    This manuscript deals with the qualitative study of certain properties of an immunogenic tumors model. Mainly, we obtain a dynamically consistent discrete-time immunogenic tumors model using a nonstandard difference scheme. The existence of fixed points and their stability are discussed. It is shown that the continuous system experiences Hopf bifurcation at one and only one positive fixed point, whereas its discrete-time counterpart experiences Neimark–Sacker bifurcation at one and only one positive fixed point. It is shown that there is no chance of period-doubling bifurcation in our discrete-time system. Additionally, numerical simulations are carried out in support of our theoretical discussion. Spanish Government and European Commission, Grant RTI2018-094336-B-I00 (MCIU/AEI/FEDER, UE); Basque Government, Grant IT1207-19
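As an illustration of the nonstandard-difference idea (applied to the logistic equation, not the paper's tumor model), the sketch below discretizes x' = r x (1 - x) with the nonlinear term taken as x_{n+1} x_n. This yields an explicit update that keeps positive solutions positive and preserves both fixed points of the continuous model for any step size, which is the kind of dynamic consistency the abstract refers to.

```python
def nsfd_logistic(x0, r, h, steps):
    """Nonstandard finite difference scheme for x' = r*x*(1 - x).

    Replacing the derivative by (x_{n+1} - x_n)/h and the x**2 term by
    x_{n+1}*x_n gives the explicit update
        x_{n+1} = (1 + h*r) * x_n / (1 + h*r*x_n),
    which preserves positivity and the fixed points x = 0 and x = 1
    for every step size h > 0.
    """
    x = x0
    traj = [x]
    for _ in range(steps):
        x = (1 + h * r) * x / (1 + h * r * x)
        traj.append(x)
    return traj

# Even with a very large step size the iterates remain positive and
# converge to the stable fixed point x = 1, with no spurious oscillations.
traj = nsfd_logistic(0.1, r=2.0, h=5.0, steps=50)
```

A standard explicit Euler step with the same h would overshoot and diverge, which is exactly the qualitative inconsistency such schemes are designed to avoid.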

    Targeted Genome Editing for Cotton Improvement

    Conventional tools induce mutations randomly throughout the cotton genome, making breeding difficult and challenging. During the last decade, progress has been made toward editing a gene of interest in a very precise manner. Targeted genome engineering with engineered nucleases (ENs), specifically zinc-finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs), and clustered regularly interspaced short palindromic repeat (CRISPR) RNA-guided nucleases (e.g., Cas9), has been described as a "game-changing technology" for fields as diverse as human genetics and plant biotechnology. In eukaryotic systems, ENs create double-strand breaks (DSBs) at the targeted DNA sequence, which are repaired by the nonhomologous end joining (NHEJ) or homology-directed recombination (HDR) mechanisms. ENs have been used successfully for targeted mutagenesis, gene knockout, and multisite genome editing (GenEd) in model and crop plants such as cotton, rice, and wheat. Recently, the cotton genome has also been edited through CRISPR/Cas targeted mutagenesis for improved lateral root formation. In addition, an efficient and fast method has been developed to evaluate guide RNAs transiently in cotton. Targeted disruption of undesirable genes or metabolic pathways can be used to increase the quality of cotton. Undesirable metabolites such as gossypol in cottonseed can be targeted efficiently using ENs to produce seed-specific low-gossypol cotton. Moreover, ENs are also helpful for gene stacking for herbicide resistance, insect resistance, and abiotic stress tolerance.